A generalized disambiguation algorithm for weighted finite automata and its application to NLP tasks

Authors

  • Katsuhiko Hayashi
  • Masaaki Nagata
Abstract

We present a disambiguation algorithm for weighted finite tree automata (FTA). This algorithm converts an ambiguous FTA into an equivalent non-ambiguous one, in which no two accepting paths are labeled with the same tree. The notion of non-ambiguity is similar to that of determinism in automata theory, but we show that disambiguation is applicable to a wider class of weighted automata than determinization. We conduct experiments on Natural Language Processing (NLP) tasks and show that, in practice, disambiguated automata are much smaller than determinized automata.
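
The notion of ambiguity used in the abstract can be illustrated with a minimal sketch. It assumes a toy weighted automaton over strings rather than trees, with hypothetical state names and weights, where weights are multiplied along a path and summed across paths. It only enumerates accepting paths to show what "ambiguous" means; it is not the disambiguation algorithm of the paper.

```python
# Toy weighted automaton over strings (an assumption made for illustration;
# the paper works with tree automata).
# transitions[(state, symbol)] -> list of (next_state, weight)
AMBIGUOUS = {
    ("q0", "a"): [("q1", 0.5), ("q2", 0.25)],
    ("q1", "b"): [("q3", 1.0)],
    ("q2", "b"): [("q3", 1.0)],
}
# An equivalent automaton with a single accepting path for "ab".
UNAMBIGUOUS = {
    ("q0", "a"): [("q1", 0.75)],
    ("q1", "b"): [("q3", 1.0)],
}
INITIAL = "q0"
FINAL = {"q3"}


def accepting_paths(string, transitions):
    """Return the weight of every accepting path labeled with `string`."""
    frontier = [(INITIAL, 1.0)]  # (current state, weight accumulated so far)
    for symbol in string:
        next_frontier = []
        for state, weight in frontier:
            for next_state, w in transitions.get((state, symbol), []):
                next_frontier.append((next_state, weight * w))
        frontier = next_frontier
    return [weight for state, weight in frontier if state in FINAL]


def is_ambiguous_on(string, transitions):
    """True if more than one accepting path carries the same label."""
    return len(accepting_paths(string, transitions)) > 1


def total_weight(string, transitions):
    """Weight assigned to `string`: the sum over all its accepting paths."""
    return sum(accepting_paths(string, transitions))
```

In this sketch both automata assign the same total weight to the string "ab", but only the first has two accepting paths for it. Equivalence with a single path per accepted input is the property the abstract describes.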

Similar resources

The correctness of a generalized disambiguation algorithm for finite automata

We present a generalized disambiguation algorithm for finite state automata and give a proof of its correctness. This algorithm can remove ambiguities from finite state and tree automata. Our proposed algorithm can make finite state and tree automata more efficient to use in many applications.

On the Disambiguation of Weighted Automata

We present a disambiguation algorithm for weighted automata. The algorithm admits two main stages: a pre-disambiguation stage followed by a transition removal stage. We give a detailed description of the algorithm and the proof of its correctness. The algorithm is not applicable to all weighted automata but we prove sufficient conditions for its applicability in the case of the tropical semirin...

A disambiguation algorithm for weighted automata

We present a disambiguation algorithm for weighted automata. The algorithm admits two main stages: a pre-disambiguation stage followed by a transition removal stage. We give a detailed description of the algorithm and the proof of its correctness. The algorithm is not applicable to all weighted automata but we prove sufficient conditions for its applicability in the case of the tropical semirin...

A Better N-Best List: Practical Determinization of Weighted Finite Tree Automata

Ranked lists of output trees from syntactic statistical NLP applications frequently contain multiple repeated entries. This redundancy leads to misrepresentation of tree weight and reduced information for debugging and tuning purposes. It is chiefly due to nondeterminism in the weighted automata that produce the results. We introduce an algorithm that determinizes such automata while preserving...

Publication date: 2014